Unsupervised Information Extraction Approach Using Graph Mutual Reinforcement
Authors
Abstract
Information Extraction (IE) is the task of extracting knowledge from unstructured text. We present a novel unsupervised approach to information extraction based on graph mutual reinforcement. The approach requires no seed patterns or examples. Instead, it relies on redundancy in large data sets and on graph-based mutual reinforcement to induce generalized “extraction patterns”. We used the approach to acquire extraction patterns for the ACE (Automatic Content Extraction) Relation Detection and Characterization (RDC) task. ACE RDC is considered a hard information extraction task because large amounts of training data are unavailable and the available data contain inconsistencies. The proposed approach achieves strong performance, comparable to that of supervised techniques trained on a reasonable amount of data.
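The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of one common form of graph mutual reinforcement: a HITS-style iteration on a bipartite graph linking candidate patterns to the tuples they match. The function name `mutual_reinforcement`, the edge weights, and the toy patterns and tuples are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def mutual_reinforcement(edges, iterations=50):
    """HITS-style scoring on a bipartite pattern/tuple graph.

    edges maps (pattern, tuple) pairs to a co-occurrence count.
    This is an illustrative assumption, not the paper's exact method.
    """
    patterns = {p for p, _ in edges}
    tuples_ = {t for _, t in edges}
    p_score = {p: 1.0 for p in patterns}
    t_score = {t: 1.0 for t in tuples_}

    for _ in range(iterations):
        # A pattern is important if it matches important tuples.
        new_p = {p: 0.0 for p in patterns}
        for (p, t), w in edges.items():
            new_p[p] += w * t_score[t]
        norm = sqrt(sum(v * v for v in new_p.values())) or 1.0
        p_score = {p: v / norm for p, v in new_p.items()}

        # A tuple is important if it is matched by important patterns.
        new_t = {t: 0.0 for t in tuples_}
        for (p, t), w in edges.items():
            new_t[t] += w * p_score[p]
        norm = sqrt(sum(v * v for v in new_t.values())) or 1.0
        t_score = {t: v / norm for t, v in new_t.items()}

    return p_score, t_score


# Hypothetical toy data: how often each pattern matched each entity pair.
edges = {
    ("PERSON works for ORG", ("Smith", "IBM")): 3,
    ("PERSON works for ORG", ("Lee", "Acme")): 2,
    ("PERSON , CEO of ORG", ("Smith", "IBM")): 1,
    ("PERSON visited ORG", ("Jones", "Acme")): 1,
}
p_score, t_score = mutual_reinforcement(edges)
best_pattern = max(p_score, key=p_score.get)
```

In this sketch, patterns that match many frequently seen tuples rise to the top, which is how redundancy in the data can stand in for seed examples. A real system would also need pattern generalization and a threshold for accepting patterns, neither of which is shown here.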
Similar resources
Cross-Language Opinion Lexicon Extraction Using Mutual-Reinforcement Label Propagation
There is growing interest in automatically building opinion lexicons from sources such as product reviews. Most of these methods depend on abundant external resources such as WordNet, which limits their applicability. Unsupervised or semi-supervised learning offers an alternative solution to multilingual opinion lexicon extraction. However, the datasets are imbalanced in differen...
Discovering Salience in Textual Elements using Graph Mutual Reinforcement SI508 Project
The problem of identifying the most salient terms and/or sentences in a set of documents has gained great interest in recent years. Identifying the most salient terms in a set of documents is usually called automatic keyword extraction or terminology extraction. Extracting the most salient set of sentences from a document or a set of documents is used for extractive summarization w...
A Comparison of Graph-Based and Statistical Metrics for Learning Domain Keywords
In this paper, we present a comparison of unsupervised and supervised methods for key-phrase extraction from a domain corpus. The unsupervised methods we tested employ individual statistical measures and graph-based measures, while the supervised methods apply machine learning models that include combinations of these statistical and graph-based measures. Graph-based measures are applied on a...
Sentiment Translation through Multi-Edge Graphs
Sentiment analysis systems can benefit from the translation of sentiment information. We present a novel, graph-based approach using SimRank, a well-established graph-theoretic algorithm, to transfer sentiment information from a source language to a target language. We evaluate this method in comparison with semantic orientation using pointwise mutual information (SO-PMI), an established unsupe...
BioNoculars: Extracting Protein-Protein Interactions from Biomedical Text
The vast number of published medical documents is considered a vital source for relationship discovery. This paper presents a statistical unsupervised system, called BioNoculars, for extracting protein-protein interactions from biomedical text. BioNoculars uses graph-based mutual reinforcement to make use of redundancy in data to construct extraction patterns in a domain independent fashion. Th...